38 research outputs found
Text-mining and ontologies: new approaches to knowledge discovery of microbial diversity
Microbiology research has access to a very large amount of public information
on the habitats of microorganisms. Many areas of microbiology research use
this information, primarily in biodiversity studies. However, the habitat
information is expressed in unstructured natural language, which hinders
its exploitation at large scale. Similar habitats are very often described
by different terms, e.g. intestine and gut, which makes them hard to compare
automatically. A common reference to standardize these habitat
descriptions, as argued by Ivana et al. (2010), is a necessity. We
propose the OntoBiotope ontology, which we have been developing since
2010. OntoBiotope has a formal, machine-readable representation
that enables indexing of information as well as conceptualization and
reasoning.
Comment: 5 pages
Extracting lay paraphrases of specialized expressions from monolingual comparable medical corpora
Whereas multilingual comparable corpora have been used to identify translations of words or terms, monolingual corpora can help identify paraphrases. The present work addresses paraphrases found between two different discourse types: specialized and lay texts. We therefore built comparable corpora of specialized and lay texts in order to detect equivalent lay and specialized expressions. We identified two devices used in such paraphrases: nominalizations and neo-classical compounds. The results showed that the paraphrases had good precision and that nominalizations were indeed relevant to studying the differences between specialized and lay language. The results for neo-classical compounds were less conclusive. This study also demonstrates that simple paraphrase acquisition methods can work on texts with a rather small degree of similarity, once similar text segments are detected.
Detecting negation of medical problems in French clinical notes
Design of an extensive information representation scheme for clinical narratives
Background: Knowledge representation frameworks are essential to the understanding of complex biomedical processes and to the analysis of biomedical texts that describe them. Combined with natural language processing (NLP), they have the potential to contribute to retrospective studies by unlocking important phenotyping information contained in the narrative content of electronic health records (EHRs). This work aims to develop an extensive information representation scheme for clinical information contained in EHR narratives, and to support secondary use of EHR narrative data to answer clinical questions. Methods: We review recent work that proposed information representation schemes and applied them to the analysis of clinical narratives. We then propose a unifying scheme that supports the extraction of information to address a large variety of clinical questions. Results: We devised a new information representation scheme for clinical narratives that comprises 13 entities, 11 attributes and 37 relations. The associated annotation guidelines can be used to consistently apply the scheme to clinical narratives and are available at https://cabernet.limsi.fr/annotation_guide_for_the_merlot_french_clinical_corpus-Sept2016.pdf. Conclusion: The information scheme includes many elements of the major schemes described in the clinical natural language processing literature, as well as a uniquely detailed set of relations.